A Tale about PRO and Monsters
Abstract
While experimenting with tuning on long sentences, we made an unexpected discovery: PRO falls victim to monsters, overly long negative examples with very low BLEU+1 scores, which are unsuitable for learning and can cause testing BLEU to drop by several points absolute. We propose several effective ways to address the problem, using length- and BLEU+1-based cut-offs, outlier filters, stochastic sampling, and random acceptance. The best of these fixes not only slay and protect against monsters, but also yield higher stability for PRO as well as improved test-time BLEU scores. Thus, we recommend them to anybody using PRO, monster-believer or not.

1 Once Upon a Time...

For years, the standard way to do statistical machine translation parameter tuning has been to use minimum error-rate training, or MERT (Och, 2003). However, as researchers started using models with thousands of parameters, new scalable optimization algorithms such as MIRA (Watanabe et al., 2007; Chiang et al., 2008) and PRO (Hopkins and May, 2011) have emerged. As these algorithms are relatively new, they are not yet well understood, and studying their properties is an active area of research.

For example, Nakov et al. (2012) pointed out that PRO tends to generate translations that are consistently shorter than desired. They attributed this to inadequate smoothing in PRO's optimization objective, namely sentence-level BLEU+1, and they addressed the problem using more sensible smoothing. We wondered whether the issue could be partially relieved simply by tuning on longer sentences, for which the effect of smoothing would naturally be smaller.

To our surprise, tuning on the longer 50% of the tuning sentences had a disastrous effect on PRO, causing an absolute drop of three BLEU points on testing; at the same time, MERT and MIRA did not have such a problem. While investigating the reasons, we discovered hundreds of monsters creeping under PRO's surface...

Our tale continues as follows. We first explain what monsters are in Section 2, then present a theory about how they can be slain in Section 3, put this theory to the test in practice in Section 4, and discuss some related efforts in Section 5. Finally, we present the moral of our tale and hint at some planned future battles in Section 6.
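To make the notion of a monster concrete, the following is a minimal sketch of sentence-level BLEU+1, assuming the common formulation in which add-one smoothing is applied to the n-gram precisions for n > 1. The function names, the whitespace tokenization, and the toy sentences are illustrative assumptions and are not taken from the paper; the sketch only shows why an overly long hypothesis ends up with a very low score.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu_plus_one(hyp, ref, max_n=4):
    """Sentence-level BLEU with add-one smoothing for n > 1 (assumed formulation)."""
    if not hyp:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp, n)
        ref_counts = ngrams(ref, n)
        # Clipped matches: an n-gram counts at most as often as it occurs in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if n > 1:
            # Add-one smoothing keeps higher-order precisions non-zero.
            matches += 1
            total += 1
        if matches == 0:
            return 0.0  # no unigram overlap at all
        log_precision += math.log(matches / total) / max_n
    # Brevity penalty only punishes hypotheses shorter than the reference.
    brevity_penalty = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return brevity_penalty * math.exp(log_precision)


# Toy data (hypothetical): a perfect hypothesis and an overly long one.
reference = "the cat sat on the mat".split()
good = list(reference)
monster = reference + [f"junk{i}" for i in range(45)]

print(bleu_plus_one(good, reference))
print(bleu_plus_one(monster, reference))
```

In this toy case the long hypothesis contains the whole reference, yet its n-gram precisions are diluted by the extra tokens, and the brevity penalty offers no counterweight because it applies only to short output.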
Publication date: 2013